Two Dimensional Evaluation Reinforcement Learning

Authors

  • Hiroyuki Okada
  • Hiroshi Yamakawa
  • Takashi Omori
Abstract

To solve the problem of the trade-off between exploration and exploitation actions in reinforcement learning, the authors have proposed two-dimensional evaluation reinforcement learning, which distinguishes between reward and punishment evaluation forecasts. The proposed method uses the difference between reward evaluation and punishment evaluation as a factor for determining the action and the sum as a parameter for determining the ratio of exploration to exploitation. In this paper we describe an experiment with a mobile robot searching for a path and the subsequent conflict between exploration and exploitation actions. The results of the experiment show that the proposed method, which uses the two dimensions of reward and punishment, can generate a better path than the conventional reinforcement learning method.
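The role of the difference and the sum described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the softmax selection rule, the function names, the `base_temperature` parameter, and especially the direction of the mapping from the sum to the exploration rate (a larger sum gives a lower temperature here) are all assumptions made for illustration only.

```python
import math
import random


def evaluation_difference(reward_eval, punish_eval):
    """Per-action difference (reward minus punishment); ranks the actions."""
    return [r - p for r, p in zip(reward_eval, punish_eval)]


def evaluation_sum(reward_eval, punish_eval):
    """Per-action sum (reward plus punishment); sets the exploration ratio."""
    return [r + p for r, p in zip(reward_eval, punish_eval)]


def action_probabilities(reward_eval, punish_eval, base_temperature=1.0):
    """Softmax over the differences, with a temperature derived from the sums.

    ASSUMPTION: a larger mean sum lowers the temperature, which means less
    exploration. The abstract does not state this mapping or its direction.
    """
    diffs = evaluation_difference(reward_eval, punish_eval)
    sums = evaluation_sum(reward_eval, punish_eval)
    mean_sum = sum(sums) / len(sums)
    temperature = base_temperature / (1.0 + mean_sum)
    best = max(diffs)  # subtract the maximum for numerical stability
    weights = [math.exp((d - best) / temperature) for d in diffs]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(reward_eval, punish_eval, rng=random):
    """Sample an action index according to the probabilities above."""
    probs = action_probabilities(reward_eval, punish_eval)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

With hypothetical evaluations `reward_eval=[1.0, 0.2]` and `punish_eval=[0.1, 0.5]`, the first action has the larger difference and so receives the larger selection probability. Raising both evaluations while keeping the differences fixed makes the choice more deterministic under the assumed mapping.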

Similar resources

Realworld Robot Navigation by Two Dimensional Evaluation Reinforcement Learning

The trade-off between exploration and exploitation is present in any learning method based on trial and error, such as reinforcement learning. We have proposed a reinforcement learning algorithm using reward and punishment as repulsive evaluation (2D-RL). In the algorithm, an appropriate balance between exploration and exploitation can be attained by using interest and utility. In this paper, we appl...

Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

High-dimensional observations and complex real-world dynamics present major challenges in reinforcement learning for both function approximation and exploration. We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, nonparametric function approximator for learning on Q-function residuals. And second, we propose an exploration strategy...

Gaussian Processes in Reinforcement Learning

We exploit some useful properties of Gaussian process (GP) regression models for reinforcement learning in continuous state spaces and discrete time. We demonstrate how the GP model allows evaluation of the value function in closed form. The resulting policy iteration algorithm is demonstrated on a simple problem with a two dimensional state space. Further, we speculate that the intrinsic abili...

Evaluation of Ultimate Torsional Strength of Reinforcement Concrete Beams Using Finite Element Analysis and Artificial Neural Network

Due to lack of theory of elasticity, estimation of ultimate torsional strength of reinforcement concrete beams is a difficult task. Therefore, the finite element methods could be applied for determination of strength of concrete beams. Furthermore, for complicated, highly nonlinear and ambiguous status, artificial neural networks are appropriate tools for prediction of behavior of such states. ...

Parallel Recombinative Reinforcement Learning: A Genetic Approach

A technique is presented that is suitable for function optimization in high-dimensional binary domains. The method allows an efficient parallel implementation and is based on the combination of genetic algorithms and reinforcement learning schemes. More specifically, a population of probability vectors is considered, each member corresponding to a reinforcement learning optimizer. Each probabil...

Publication date: 2001